DNA sequencing method

ABSTRACT

The present invention pertains to a method for determining the sequence of a polynucleotide, the method relying on the detection of a conformational change in an enzyme that interacts with and processes along the polynucleotide. The detection of a conformational change may be carried out by measuring changes in a fluorophore bound to the enzyme.

FIELD OF THE INVENTION

This invention relates to polynucleotide sequence determinations.

BACKGROUND OF THE INVENTION

The ability to determine the sequence of a polynucleotide is of great scientific importance. For example, the Human Genome Project is an ambitious international effort to map and sequence the three billion bases of DNA encoded in the human genome. When complete, the resulting sequence database will be a tool of unparalleled power for biomedical research. The major obstacle to the successful completion of this project concerns the technology used in the sequencing process.

The principal method in general use for large-scale DNA sequencing is the chain termination method. This method was first developed by Sanger and Coulson (Sanger et al., Proc. Natl. Acad. Sci. USA, 1977; 74:5463-5467), and relies on the use of dideoxy derivatives of the four nucleoside triphosphates which are incorporated into a nascent polynucleotide chain in a polymerase reaction. Upon incorporation, the dideoxy derivatives terminate the polymerase reaction and the products are then separated by gel electrophoresis and analysed to reveal the position at which the particular dideoxy derivative was incorporated into the chain.

Although this method is widely used and produces reliable results, it is recognised that it is slow, labour-intensive and expensive.

An alternative sequencing method is proposed in EP-A-0471732, which uses spectroscopic means to detect the incorporation of a nucleotide into a nascent polynucleotide strand complementary to a target. The method relies on an immobilised complex of template and primer, which is exposed to a flow containing only one of the different nucleotides. Spectroscopic techniques are then used to measure a time-dependent signal arising from the polymerase catalysed growth of the template copy. The spectroscopic techniques described are surface plasmon resonance (SPR) spectroscopy, which measures changes in an analyte within an evanescent wave field, and fluorescence measuring techniques. However, limitations of this method are recognised; the most serious for the SPR technique being that, as the size of the copy strand grows, the absolute size of the signal also grows due to the movement of the strand out of the evanescent wave field, making it harder to detect increments. The fluorescence measuring techniques have the disadvantage of increasing background interference from the fluorophores incorporated on the growing nascent polynucleotide chain. As the chain grows, the background “noise” increases and the time required to detect each nucleotide incorporation needs to be increased. This severely restricts the use of the method for sequencing large polynucleotides.

Single fragment polynucleotide sequencing approaches are outlined in WO-A-9924797 and WO-A-9833939, both of which employ fluorescent detection of single labelled nucleotide molecules. These single nucleotides are cleaved from the template polynucleotide, held in a flow by an optical trap (Jett, et al., J. Biomol. Struc. Dyn, 1989; 7:301-309), by the action of an exonuclease molecule. These cleaved nucleotides then flow downstream within a quartz flow cell, are subjected to laser excitation and then detected by a sensitive detection system. However, limitations of this method are recognised; the most serious for the exonuclease technique being the fact that the labelled nucleotides severely affect the processivity of the exonuclease enzyme. Other limitations of this method include ‘sticking’ of the nucleotide(s) to the biotin bead used to immobilise the polynucleotide fragment, thus resulting in the nucleotide flow becoming out of phase; inefficiency and length limitation of the initial enzymatic labelling process; and the excitation ‘cross-over’ between the four different dye molecules resulting in a greatly increased error rate.

There is therefore a need for an improved method, preferably at the single fragment level, for determining the sequence of polynucleotides, which significantly increases the rate and fragment size of the polynucleotide sequenced and which is preferably carried out by an automated process, reducing the complexity and cost associated with existing methods.

SUMMARY OF THE INVENTION

The present invention is based on the realisation that the sequence of a target polynucleotide can be determined by measuring conformational changes in an enzyme that binds to and processes along the target polynucleotide. The extent of the conformational change that takes place is different depending on which individual nucleotide on the target is in contact with the enzyme.

According to one aspect of the present invention, a method for determining the sequence of a polynucleotide comprises the steps of:

(i) reacting a target polynucleotide with an enzyme that is capable of interacting with and processing along the polynucleotide, under conditions sufficient to induce the enzyme activity; and

(ii) detecting conformational changes in the enzyme as the enzyme processes along the polynucleotide.

In a preferred embodiment, the enzyme is a polymerase enzyme which interacts with the target in the process of extending a complementary strand. The enzyme is typically immobilised on a solid support to localise the reaction within a defined area.

According to a second embodiment of the invention, the enzyme comprises a first bound detectable label, the characteristics of which alter as the enzyme undergoes a conformational change. The enzyme may also comprise a second bound detectable label capable of interacting with the first label, wherein the degree of interaction is dependent on a conformational change in the enzyme. Typically, the first label is an energy acceptor and the second label is an energy donor, and detecting the conformational change is carried out by measuring energy transfer between the two labels.

According to a further embodiment of the invention, fluorescence resonance energy transfer (FRET) is used to detect a conformational change in an enzyme that interacts with and processes along a target polynucleotide, thereby determining the sequence of the polynucleotide. Fluorescence resonance energy transfer may be carried out between FRET donor and acceptor labels, each bound to the enzyme. Alternatively, one of the labels may be bound to the enzyme and the other label bound to the polynucleotide.

According to a further embodiment, there is the use of a detectably-labelled enzyme, capable of interacting with and processing along a target polynucleotide, to determine the sequence of the polynucleotide, wherein the label alters its detectable characteristics as the enzyme processes along the polynucleotide.

According to a further aspect, a solid support comprises at least one immobilised enzyme capable of interacting with and processing along a target polynucleotide, the enzyme being labelled with one or more detectable labels.

According to a further aspect, a system for determining the sequence of a polynucleotide comprises a solid support as defined above, and an apparatus for detecting the label.

The present invention offers several advantages over conventional sequencing technology. Once a polymerase enzyme begins its round of polynucleotide elongation, it tends to polymerise several thousand nucleotides before falling off from the strand. Additionally, certain specific polymerase systems are able to anchor or tether themselves to the template polynucleotide via a ‘sliding clamp’ (e.g. Polymerase III) which encircles the template molecule or via a molecular hook (e.g. T7:thioredoxin complex) which partially encircles the template.

The invention may also enable tens of kilobases (kb) or more to be sequenced in one go, at a rate of hundreds of base pairs per second. This is a result of sequencing on a single fragment of DNA. An advantage of sequencing a single fragment of DNA is that sequencing rates are determined by the enzyme system utilised and not upon indirect, summated reactions, and are therefore correspondingly higher. Just as important as the high rate is the ability to sequence large fragments of DNA. This will significantly reduce the amount of subcloning and the number of overlapping sequences required to assemble megabase segments of sequencing information. An additional advantage of the single fragment approach is the elimination of problems associated with the disposal of hazardous wastes, such as acrylamide, which plague current sequencing efforts.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic illustration of a confocal microscope setup for use in the invention;

FIG. 2 illustrates a trace taken after fluorescence resonance energy transfer, with each of the peaks representing the detection of a specific nucleotide.

DESCRIPTION OF THE INVENTION

The present method for sequencing a polynucleotide involves the analysis of conformational changes between an enzyme and a target polynucleotide.

The term “polynucleotide” as used herein is to be interpreted broadly, and includes DNA and RNA, including modified DNA and RNA, as well as other hybridising nucleic acid-like molecules, e.g. peptide nucleic acid (PNA).

The enzyme may be a polymerase enzyme, and a conformational change is brought about when the polymerase incorporates a nucleotide into a nascent strand complementary to the target polynucleotide. It has been found that the conformational change will be different for each of the different nucleotides, A, T, G or C, and therefore measuring the change will identify which nucleotide is incorporated.

Alternatively, the enzyme may be any that is involved in an interaction with a polynucleotide, e.g. a helicase enzyme, a primase or a holoenzyme. As the enzyme processes along the polynucleotide, its conformation will change depending on which nucleotide on the target it is brought into contact with.

One way of detecting a conformational change in the enzyme is to measure resonance energy transfer between a suitable energy donor label and a suitable energy acceptor label. In one example the donor and acceptor are each bound to the enzyme and the conformational change in the enzyme brought about by its interaction with the target polynucleotide alters the relative positioning of the labels. The differences in positioning are reflected in the resulting energy transfer and are characteristic of the particular nucleotide in contact with the enzyme. Alternatively, one label may be positioned on the enzyme and the other on a nucleotide of the target or on a nucleotide incorporated onto a strand complementary to the target.

The use of fluorescence resonance energy transfer (FRET) is a preferred embodiment of the invention. This technique is capable of measuring distances on the 2 to 8 nm scale and relies on the distance-dependent energy transfer between a donor fluorophore and an acceptor fluorophore. The technique not only has superior static co-localization capabilities but can also provide information on dynamic changes in the distance or orientation between the two fluorophores for intramolecular and intermolecular FRET. Since the first measurement of energy transfer between a single donor and a single acceptor (single pair FRET) (Ha, et al., Proc. Natl. Acad. Sci. USA, 1996; 96:893), it has been used to study ligand-receptor co-localisation (Schutz, et al., Biophys. J., 1998; 74:2223), to probe equilibrium protein structural fluctuations and enzyme-substrate interactions during catalysis (Ha, et al., 1999 Supra), and to identify conformational states and sub-populations of individual diffusing molecules in solutions. All of these variables are envisioned as applicable within the context of the invention.
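
The distance dependence referred to above is commonly described by the Förster relation, in which transfer efficiency falls with the sixth power of the donor-acceptor separation relative to the Förster radius. The following is a minimal sketch of that standard relation only; the function name and the 5 nm radius used in the usage note are illustrative assumptions and are not values taken from this specification.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Standard Forster relation: E = 1 / (1 + (r / R0)**6).

    r_nm  -- donor-acceptor separation in nanometres
    r0_nm -- Forster radius in nanometres (separation at which E = 0.5);
             the caller must supply a value for the dye pair in use
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("separation must be >= 0 and Forster radius > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Illustrative only: an assumed Forster radius of 5 nm, sampled across
    # the 2 to 8 nm range mentioned in the text.
    for r in (2.0, 5.0, 8.0):
        print(r, round(fret_efficiency(r, 5.0), 3))
```

Because of the sixth-power term, small changes in label separation near the Förster radius give comparatively large changes in efficiency, which is why the technique is suited to reporting conformational changes.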

The present invention may also be carried out using measurement techniques that require only a single label. Any system that is capable of measuring changes in the local environment of the enzyme at the single molecule level is an accepted embodiment of the invention. Various properties of single fluorescent probes attached to a polynucleotide processive enzyme and/or its substrate(s) can be exploited in the context of the invention to provide data on variables within or in close proximity to the enzyme system/molecular environment that are specific to a nucleotide incorporation event. Such variables include, but are not limited to, molecular interactions, enzymatic activity, reaction kinetics, conformational dynamics, molecular freedom of motion, and alterations in activity and in chemical and electrostatic environment.

For example, the absorption and emission transition dipoles of single fluorophores can be determined by using polarized excitation light or by analysing the emission polarisation, or both. The temporal variation in dipole orientation of a rigidly attached or rotationally diffusing tethered label can report on the angular motion of a macromolecule system or one of its subunits (Warshaw, et al., Proc. Natl. Acad. Sci. USA, 1998; 95:8034) and therefore may be applied in the present invention.

The labels that may be used in the present invention will be apparent to those skilled in the art. Preferably, the label is a fluorescent label, such as those disclosed in Xue, et al., Nature, 1995; 373:681. Alternatively, fluorescing enzymes such as green fluorescent protein (Lu, et al., Science, 1998; 282:1877) can be employed.

The preferred embodiment of the invention, however, involves the use of small fluorescent molecules that are covalently and site-specifically attached to the polynucleotide processive enzyme, e.g. tetramethylrhodamine (TMR).

If fluorescent labels are used in the invention, their detection may be affected by photobleaching caused by repeated exposure to excitation wavelengths. One possible way to avoid this problem is to carry out many sequential reactions, but detecting fluorescence signals on only a few at a time. Using this iterative process, the correct sequence of signals can be determined and the polynucleotide sequence determined. For example, by immobilising a plurality of enzymes on a solid support and contacting them with the target polynucleotide, the sequencing reactions should start at approximately the same time. Excitation and detection of fluorescence can be localised to a proportion of the total reactions, for a time until photobleaching becomes evident. At this time, excitation and detection can be transferred to a different proportion of the reactions to continue the sequencing. As all the reactions are relatively in phase, the correct sequence should be obtained with minimal sequence re-assembly.
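
The hand-over between groups of reactions described above can be pictured as joining successive observation windows into one read. The sketch below is purely illustrative and is not the procedure of the specification: the function name, the representation of each window as a start position plus the bases called from it, and the rule of resolving overlap by position alone are all assumptions made for the example, and the base strings in the usage lines are made-up data.

```python
from typing import List, Tuple


def stitch_windows(windows: List[Tuple[int, str]]) -> str:
    """Join base calls from successive observation windows into one read.

    Each window is (start, bases): the read position at which observation
    of that group of enzymes began, and the bases called while it was
    observed. The reactions are assumed to be in phase, so overlapping
    windows are reconciled by position only (a hypothetical simplification).
    """
    read: List[str] = []
    for start, bases in sorted(windows):
        if start > len(read):
            raise ValueError(f"no window covers position {len(read)}")
        # Append only the part of this window that extends the read so far.
        read.extend(bases[len(read) - start:])
    return "".join(read)


if __name__ == "__main__":
    # Three groups observed in turn, each handed over before bleaching.
    print(stitch_windows([(0, "CGTTC"), (4, "CCTTA"), (9, "CC")]))
```

A real implementation would also need to check that overlapping calls agree, which is where the "minimal sequence re-assembly" mentioned in the text would come in.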

The labels may be attached to the enzymes by covalent or other linkages. A number of strategies may be used to attach the labels to the enzyme. Strategies include the use of site-directed mutagenesis and unnatural amino acid mutagenesis (Anthony-Cahill, et al., Trends Biochem. Sci., 1989; 14:400) to introduce cysteine and ketone handles for specific and orthogonal dye labelling of proteins (Cornish, et al., Proc. Natl. Acad. Sci. USA, 1994; 91:2910).

Another foreseen embodiment used to tag the polynucleotide processive enzyme is the fusion of green fluorescent protein (GFP) to the processive enzyme (e.g. polymerase) via molecular cloning techniques known in the art (Pierce, D. W. et al., Nature, 1997; 388:338). This technique has been demonstrated to be applicable to the measurement of conformational changes (Miyawaki, et al., Nature, 1997; 388:882) and local pH changes (Llopis, et al., Proc. Natl. Acad. Sci. USA, 1998; 95:6803).

Supports suitable for use in immobilising the enzymes will be apparent to the skilled person. Silicon, glass and ceramic materials may be used. The support will usually be a planar surface. Enzyme immobilisation may be carried out by covalent or other means. For example, covalent linker molecules may be used to attach to a suitably prepared enzyme. Attachment methods are known to the skilled person.

There may be one or more enzymes immobilised to the solid support. In a preferred embodiment, there are a plurality of enzymes attached. This allows monitoring of many separate reactions, and may be useful to overcome photobleaching problems as outlined above.

A variety of techniques may be used to measure a conformational change in the enzyme. Resonance energy transfer may be measured by the techniques of surface plasmon resonance (SPR) or fluorescent surface plasmon resonance.

However, other techniques which measure changes in radiation via interaction with a ‘label’ or energy transducer may be considered, for example spectroscopy by total internal reflectance fluorescence (TIRF), attenuated total reflection (ATR), frustrated total reflection (FTR), Brewster angle reflectometry, scattered total internal reflection (STIR), fluorescence lifetime imaging microscopy and spectroscopy (FLIMS), fluorescence polarisation anisotropy (FPA), fluorescence spectroscopy, or evanescent wave ellipsometry.

The invention will now be illustrated further by the following Example, with reference to the accompanying drawings.

EXAMPLE

This Example used a confocal fluorescence setup, as shown in FIG. 1.

With reference to FIG. 1, the setup consists of a scan table (1) able to scan at high resolution in X, Y and Z dimensions, a glass coverslip (2) which is part of a microfluidic flow cell system with an inlet (8) for introducing the primer-template polynucleotide complex (4) and nucleotides over the immobilised (9) polymerase molecule (3) within a buffer, and an outlet (7) for waste. Incident light from a laser light source (6) for donor excitation is delivered via an oil-immersion objective (5).

Protein Conjugation

In this experiment, tetramethylrhodamine (TMR, donor) and Cy5 (acceptor) were used as the FRET pair. This was due to their well separated emission wavelengths (>100 nm) and large Förster radius.

T7 DNA Polymerase from New England Biolabs (supplied at 10 000 U/ml) was used. 50 μl of T7 was buffer-exchanged in a Vivaspin 500 (Vivaspin) against 4×500 μl of 200 mM Sodium Acetate buffer at pH 4 in order to remove the DTT from the storage buffer that the T7 DNA Polymerase is supplied in. Then, 50 μl of the buffer-exchanged T7 DNA polymerase was added to 100 μl of Sodium Acetate buffer at pH 4 and 50 μl saturated 2-2-DiPyridyl-DiSulphide in aqueous solution. This reaction was then left for 110 minutes and the absorption at 343 nm noted. Finally, the sample was buffer-exchanged into 200 mM Tris at pH 8 as before (4 times 500 μl).

Dye attachment was verified by denaturing polyacrylamide gel electrophoresis. Cy5 succinimidyl ester (Molecular Probes) was conjugated to the TMR-T7 DNA Polymerase under the same labelling conditions and purified and characterized as described above.

Polymerase Immobilization

Glass coverslips were derivatized with N-[(3-trimethoxysilyl)propyl]ethylenediamine triacetic acid. The coverslip was then glued into a flow-cell arrangement that allowed buffer to be flowed continuously over the derivatized glass surface. The labelled polymerase was then added to the buffer and allowed to flow over the coverslip so that protein was immobilised on the glass surface.

Proteins were then immobilised on the glass-water interface with low density so that only one molecule was under the laser excitation volume at any one time. Laser light (514 nm Ar ion laser, 15 μW, circularly polarized) was focused onto a 0.4 μm spot using an oil immersion objective in an epi-illumination setup of a scanning-stage confocal microscope. The fluorescence emission was collected by the same objective and divided into two by a dichroic beam splitter (long pass at 630 nm) and detected by two Avalanche Photo Diode (APD) counting units, simultaneously.

A 585 nm band pass filter was placed in front of the donor detector; a 650 nm long pass filter was placed in front of the acceptor detector. Since the spectral ranges during fluorescence detection are sufficiently removed from the cutoff wavelength of the dichroic beam splitter, the polarization dependence of the detection efficiency of both donor and acceptor signal is negligible. It has been shown that the polarization mixing due to the high numerical aperture (NA) objective can be overlooked (Ha et al., Supra).

In order to acquire donor and acceptor emission times, a search condition on the acceptor signal was employed as outlined in Ha, et al., Appl. Phys. Lett., 1997; 70:782. This procedure aids in the screening of doubly labelled proteins: with no direct excitation of the acceptor, only proteins experiencing FRET could show acceptor signal. Once a protein was screened, located and positioned under the laser spot, donor and acceptor time traces (5 ms integration time) were acquired. The acquisition time lasted until all the fluorescent labels on the target protein were photobleached.

Reaction Initiation

Two oligonucleotides were synthesised using standard phosphoramidite chemistry. The oligonucleotide defined as SEQ ID NO: 1 was used as the target polynucleotide, and the oligonucleotide defined as SEQ ID NO: 2 was used as the primer. The two oligonucleotides were reacted under hybridizing conditions to form the target-primer complex.

CAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG SEQ ID NO: 1

CTCTCCCTTCTCTCGTC SEQ ID NO: 2
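
The relationship between the two sequences can be checked with a short script. The sketch below assumes both sequences are written 5' to 3' as listed, confirms that the primer is the reverse complement of the 3' end of the target, and derives the order in which nucleotides would be expected to be added to the primer. The function and variable names are illustrative and do not appear in the specification.

```python
TARGET = "CAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG"  # SEQ ID NO: 1
PRIMER = "CTCTCCCTTCTCTCGTC"  # SEQ ID NO: 2

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_incorporations(target: str, primer: str) -> str:
    """Bases expected to be added to the primer, in order of addition.

    Assumes the primer anneals to the 3' end of the target, so the
    polymerase copies the remaining single-stranded 5' portion, reading
    it from right to left as written.
    """
    site = reverse_complement(primer)
    if not target.endswith(site):
        raise ValueError("primer does not anneal to the 3' end of the target")
    single_stranded = target[: len(target) - len(site)]
    return reverse_complement(single_stranded)


if __name__ == "__main__":
    bases = expected_incorporations(TARGET, PRIMER)
    print(len(bases), bases)
```

The derived string is the complement of the unprimed part of the target read right to left, which is the order in which incorporation events would be expected to appear in a time trace.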

The reaction was then initiated by injecting the primed DNA into the flow cell with all four nucleotides (dGTP, dCTP, dATP and dTTP) present at a concentration of 0.4 mM. The flow cell was maintained at 25 degrees Celsius by a modified Peltier device.

An oxygen-scavenging system was also employed [50 μg/ml glucose oxidase, 10 μg/ml catalase, 18% (wt/wt) glucose, 1% (wt/vol) β-mercaptoethanol] to prolong fluorescent lifetimes (Funatsu, et al., Nature, 1995; 374:555-559).

FRET Data Analysis

Initial studies have determined the origins of blinking, photobleaching and triplet state spikes (Ha, et al., Chem. Phys., 1999; 247:107-118), all of which can interfere with the underlying changes in FRET efficiency, due to distance changes between fluorophores as a result of conformational changes. After subtracting the background signal from donor and acceptor time traces as disclosed in Ha, et al., 1999 (Supra), a five-point median filter was applied to remove triplet spikes. Next, data points that showed simultaneous dark counts on both detectors due to donor blinking events were removed from the time traces.
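
The three clean-up steps described above (background subtraction, five-point median filtering, removal of points that are dark on both detectors) can be sketched as follows. This is a minimal illustration under stated assumptions and not the code used in the Example: the background levels and the dark-count threshold are hypothetical inputs, and the window is simply truncated at the ends of the trace.

```python
from statistics import median
from typing import List, Sequence, Tuple


def median_filter(trace: Sequence[float], window: int = 5) -> List[float]:
    """Sliding median; the window is truncated at the ends of the trace."""
    half = window // 2
    n = len(trace)
    return [median(trace[max(0, i - half): min(n, i + half + 1)])
            for i in range(n)]


def clean_traces(donor: Sequence[float], acceptor: Sequence[float],
                 donor_bg: float, acceptor_bg: float,
                 dark_threshold: float) -> List[Tuple[float, float]]:
    """Return (donor, acceptor) pairs after the three clean-up steps."""
    # 1. Subtract the background from each channel.
    d = [x - donor_bg for x in donor]
    a = [x - acceptor_bg for x in acceptor]
    # 2. Five-point median filter to suppress short triplet spikes.
    d, a = median_filter(d), median_filter(a)
    # 3. Drop points where both channels are dark (donor blinking).
    return [(di, ai) for di, ai in zip(d, a)
            if not (di <= dark_threshold and ai <= dark_threshold)]


if __name__ == "__main__":
    donor = [105, 105, 505, 105, 105]   # one spike in the donor channel
    acceptor = [55, 55, 55, 55, 55]
    print(clean_traces(donor, acceptor, 5, 5, 10))
```

A median filter is used rather than a mean because a single-point spike does not shift the median of a five-point window.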

The amount of donor signal recovery upon acceptor photobleaching is related to the quantum yields of the molecules and their overall detection efficiencies.

An energy transfer efficiency time trace was then obtained. The FRET efficiency time trace during polymerization of target strand SEQ ID NO: 1 is shown in FIG. 2. Reading FIG. 2, the sequence corresponds to the complement of that of SEQ ID NO: 1 (reading right to left, minus that part which hybridises to the primer sequence).
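
One common way to use this relationship is to derive a correction factor from the signal changes at acceptor photobleaching and apply it when converting donor and acceptor counts to a transfer efficiency. The sketch below shows that widely used ratiometric calculation; it is offered as an assumption about how such a trace could be computed, not as the calculation actually performed in this Example, and the function names and the symbol gamma are illustrative.

```python
def gamma_from_photobleaching(donor_before: float, donor_after: float,
                              acceptor_before: float,
                              acceptor_after: float) -> float:
    """Correction factor from the step changes at acceptor photobleaching.

    gamma = (drop in acceptor signal) / (rise in donor signal); it lumps
    together the quantum yields and detection efficiencies of the two dyes.
    """
    donor_rise = donor_after - donor_before
    if donor_rise <= 0:
        raise ValueError("donor signal did not recover on photobleaching")
    return (acceptor_before - acceptor_after) / donor_rise


def efficiency_from_counts(donor: float, acceptor: float,
                           gamma: float = 1.0) -> float:
    """Ratiometric FRET efficiency: E = I_A / (I_A + gamma * I_D)."""
    total = acceptor + gamma * donor
    if total <= 0:
        raise ValueError("no signal in either channel")
    return acceptor / total


if __name__ == "__main__":
    g = gamma_from_photobleaching(100, 300, 200, 0)
    print(g, efficiency_from_counts(100, 200, g))
```

Applying the second function to each background-corrected point of the donor and acceptor traces gives an efficiency time trace of the kind plotted in FIG. 2.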

1. A method for determining the sequence of a polynucleotide, comprising the steps of: i. reacting a target polynucleotide with an enzyme that interacts with and processes along the polynucleotide, under conditions sufficient to induce enzyme activity; and ii. detecting conformational changes in the enzyme as the enzyme processes along the polynucleotide, and thereby determining the sequence of the polynucleotide; wherein the enzyme comprises a bound fluorescent molecule, the characteristics of which alter as the enzyme undergoes a conformational change, and wherein the target polynucleotide does not comprise a label prior to, during, or after the enzyme processes along the polynucleotide, and wherein if step (i) is carried out in the presence of polynucleotide monomers, the polynucleotide monomers do not comprise a label.

2. The method according to claim 1, wherein the enzyme is a polymerase enzyme.

3. The method according to claim 1, wherein the enzyme is a helicase enzyme or a primase enzyme.

4. The method according to claim 1, wherein the enzyme is immobilised on a solid support.

5. The method according to claim 4, comprising a plurality of enzymes immobilised on the solid support.

6. The method according to claim 1, wherein the enzyme comprises a bound label that interacts with the bound fluorescent molecule, wherein the degree of interaction is dependent on a conformational change in the enzyme.

7. The method according to claim 6, wherein the bound fluorescent molecule is an energy acceptor and the bound label is an energy donor, or wherein the bound fluorescent molecule is an energy donor and the bound label is an energy acceptor, and wherein step (ii) is carried out by measuring energy transfer between the bound fluorescent molecule and the bound label.

8. The method according to claim 1, wherein step (ii) is carried out using confocal microscopy.

9. The method according to claim 8, wherein step (ii) is carried out by fluorescence imaging.

10. The method according to claim 1, wherein step (ii) is carried out by measuring a polarisation effect consequent on the altered characteristics of the bound fluorescent molecule.

11. The method according to claim 10, wherein step (ii) is carried out by fluorescence polarisation anisotropy.

12. A solid support comprising at least one immobilised polymerase or helicase enzyme, the enzyme being labelled with at least one fluorescence resonance energy transfer (FRET) donor label and at least one FRET acceptor label.

13. The solid support according to claim 12, wherein the at least one fluorescence resonance energy transfer donor label is a fluorophore.

14. A system for determining a sequence of a polynucleotide, comprising a solid support comprising at least one immobilised polymerase or helicase enzyme, the enzyme being labelled with at least one fluorescence resonance energy transfer (FRET) donor label and at least one FRET acceptor label, and an apparatus for detecting the label.

15. The solid support according to claim 12, wherein the at least one fluorescence resonance energy transfer acceptor label is a fluorophore.